fix: handle kubernetes describe failures gracefully (#1150) #1151
For your consideration @kiukchung @d4l3k
When the job is not found, the describe API should return None, no?
Do you mean
I mean that if volcano raises an exception when the job does not exist, then torchx's
Codecov Report
✅ All modified and coverable lines are covered by tests.
Additional details and impacted files
@@ Coverage Diff @@
## main #1151 +/- ##
=======================================
Coverage 91.56% 91.57%
=======================================
Files 83 83
Lines 6593 6599 +6
=======================================
+ Hits 6037 6043 +6
Misses 556 556
Oh, that's a good point, @kiukchung. Let me update this PR then!
bad0ca7 to 09d550e
This looks perfect, @kiukchung!
cool! nit: can we log the caught exception for debugging purposes? or better just let other errors throw |
09d550e to 61107b6
Done, @kiukchung. I agree it's cleaner.
@clumsy can you rebase on top of
61107b6 to 1c6204e
Done, @kiukchung!
When the job is not found, we show State=UNKNOWN and surface the error in Msg.
Test plan:
[x] added unit test
[x] torchx status sfai_kubernetes://torchx/real-namespace:bogus-job-id
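
The thread touches on two ways to handle a failed job lookup: the description's approach (report State=UNKNOWN and put the error in Msg) and the reviewer's suggestion (return None when the job is not found, and let other errors throw). The sketch below illustrates both under stated assumptions. It is not the code from this PR: `ApiError`, `State`, `Description`, `describe_or_unknown`, `describe_or_none`, and the `fetch` callback are hypothetical stand-ins, not torchx or Kubernetes client APIs.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Optional


class State(Enum):
    """Hypothetical stand-in for a scheduler's job states."""
    RUNNING = "RUNNING"
    UNKNOWN = "UNKNOWN"


class ApiError(Exception):
    """Hypothetical stand-in for a client error that carries an HTTP status."""

    def __init__(self, status: int, reason: str) -> None:
        super().__init__(f"({status}) {reason}")
        self.status = status
        self.reason = reason


@dataclass
class Description:
    app_id: str
    state: State
    msg: str = ""


# Assumed shape of the backend lookup: takes an app id, returns a dict
# with a "state" key, or raises ApiError.
Fetch = Callable[[str], dict]


def describe_or_unknown(app_id: str, fetch: Fetch) -> Description:
    """Variant from the description: State=UNKNOWN, error surfaced in msg."""
    try:
        job = fetch(app_id)
    except ApiError as e:
        return Description(app_id, State.UNKNOWN, msg=str(e))
    return Description(app_id, State(job["state"]))


def describe_or_none(app_id: str, fetch: Fetch) -> Optional[Description]:
    """Variant from the review: None when not found, other errors propagate."""
    try:
        job = fetch(app_id)
    except ApiError as e:
        if e.status == 404:  # assumption: "not found" is reported as 404
            return None
        raise
    return Description(app_id, State(job["state"]))
```

In the second variant, only the not-found case is swallowed, so an unexpected failure (for example, a permissions error) still reaches the caller with its original traceback.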